
Arabic OCR Error Correction Using Character Segment Correction, Language Modeling, and Shallow Morphology


Abstract

This paper explores the use of a character-segment-based correction model, language modeling, and shallow morphology for Arabic OCR error correction. Experiments show that character-segment-based correction is superior to single-character correction and that language modeling boosts correction by improving the ranking of candidate corrections, while shallow morphology had a small adverse effect. Further, given a sufficiently large corpus from which to extract a dictionary and train a language model, word-based correction works well for a morphologically rich language such as Arabic.
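
To make the idea of combining segment-level correction with a language model concrete, here is a minimal sketch of noisy-channel candidate ranking. It is not the paper's implementation: the confusion table `SEGMENT_MODEL`, the constant `MATCH_PROB`, the word counts in `WORD_COUNTS`, and the function names are all hypothetical, and Latin-script toy words stand in for Arabic so the example is easy to read. Each candidate is scored by the log probability of the segment edits that produce the OCR output plus the log unigram probability of the candidate word.

```python
import math

# Hypothetical segment confusion table: P(ocr_segment | true_segment).
# Keys are (true_segment, ocr_segment); the probabilities are made up.
SEGMENT_MODEL = {
    ("m", "rn"): 0.2,   # "m" misread as the two-character segment "rn"
    ("cl", "d"): 0.1,   # "cl" misread as "d"
    ("l", "1"): 0.3,    # "l" misread as the digit "1"
}
MATCH_PROB = 0.9  # assumed probability that a character was read correctly

# Hypothetical unigram language model: word counts from some corpus.
WORD_COUNTS = {"modern": 50, "clear": 30, "dear": 20, "model": 40}
TOTAL = sum(WORD_COUNTS.values())


def _update(table, key, log_prob):
    """Keep only the best log probability seen for each string."""
    if log_prob > table.get(key, -math.inf):
        table[key] = log_prob


def candidates(observed):
    """Map each possible true word to its best channel log probability."""
    # best[i] holds partial corrections that explain observed[:i].
    best = [dict() for _ in range(len(observed) + 1)]
    best[0][""] = 0.0
    for i in range(len(observed)):
        for prefix, lp in best[i].items():
            # Option 1: assume the character was recognized correctly.
            _update(best[i + 1], prefix + observed[i], lp + math.log(MATCH_PROB))
            # Option 2: undo a segment confusion that starts at position i.
            for (true_seg, ocr_seg), p in SEGMENT_MODEL.items():
                if observed.startswith(ocr_seg, i):
                    _update(best[i + len(ocr_seg)], prefix + true_seg,
                            lp + math.log(p))
    return best[len(observed)]


def rank(observed):
    """Return dictionary words ordered by channel score plus LM score."""
    scored = []
    for cand, channel_lp in candidates(observed).items():
        if cand in WORD_COUNTS:  # word-based correction: dictionary filter
            lm_lp = math.log(WORD_COUNTS[cand] / TOTAL)
            scored.append((channel_lp + lm_lp, cand))
    return [cand for _, cand in sorted(scored, reverse=True)]


print(rank("rnodern"))
print(rank("dear"))
```

In this toy setup, the segment model is what lets a two-character OCR segment such as "rn" map back to a single character, which a strictly one-to-one character model cannot express, and the language model term decides the order when several dictionary words are reachable.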
